Skip to content

fix: keep UDS peer failures structured#375

Merged
claude-code-best merged 4 commits intomainfrom
codex/coderabbit-final-review-fixes
Apr 27, 2026
Merged

fix: keep UDS peer failures structured#375
claude-code-best merged 4 commits intomainfrom
codex/coderabbit-final-review-fixes

Conversation

@amDosion
Copy link
Copy Markdown
Collaborator

@amDosion amDosion commented Apr 27, 2026

Summary

Follow-up after #374 was squash-merged before the final review fixes landed.

This PR carries only the final, clean delta on top of current main:

  • Keep UDS send/connect timeout and socket-error paths on UdsPeerConnectionError.
  • Hand connected raw sockets back to callers without retaining the internal timeout/error listener.
  • Tighten real-path tests for UDS timeout/connect failure, token non-leakage, AgentSummary debug assertions, and platform-specific mailbox directory errno.

Verification

  • bun test src/utils/__tests__/udsMessaging.test.ts src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/teammateMailbox.test.ts
  • bunx tsc --noEmit --pretty false
  • bun run lint
  • bun run test:all
  • bun test --coverage --coverage-reporter lcov --coverage-dir coverage
  • bun run build
  • bun run build:vite
  • Claude simplify review artifact: .omx/artifacts/claude-review-only-cross-check-for-pr-374-on-branch-codex-codecov-r-2026-04-27T08-17-47-309Z.md
  • Claude security review artifact: .omx/artifacts/claude-security-review-cross-check-for-pr-374-current-working-tree--2026-04-27T08-26-54-079Z.md

Summary by CodeRabbit

  • Tests

    • Made debug log assertions tolerant (partial matching), added platform-aware errno checks, increased timeout values, and broadened failure/connection coverage for local peer messaging.
  • Refactor

    • Made peer-connection timeout configurable and hardened connection/error handling so callers retain control of established sockets and post-connect error handling.

CodeRabbit and Claude cross-review identified that timeout and raw peer connection failures should share one observable error contract. UDS peer failures now use UdsPeerConnectionError consistently, and connectToPeer hands the socket lifecycle back to the caller after a successful connection instead of retaining an internal timeout or error listener.

The tests cover the real socket paths with capability files, timeout behavior, connection failure structure, post-connect listener handoff, AgentSummary rescheduling observations, and platform-specific mailbox directory errno handling.

Constraint: Preserve the 5000ms production timeout default while allowing tests to exercise timeout paths quickly.

Rejected: Suppress CodeRabbit warnings in tests | would hide the real timeout/error contract gap.

Rejected: Keep connectToPeer post-connect error listener | it would silently swallow caller-owned socket errors.

Confidence: high

Scope-risk: narrow

Directive: Keep UDS send/connect timeout and socket-error paths on the same structured peer error contract.

Tested: bun test src/utils/__tests__/udsMessaging.test.ts src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/teammateMailbox.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Tested: omx ask claude simplify review artifact .omx/artifacts/claude-review-only-cross-check-for-pr-374-on-branch-codex-codecov-r-2026-04-27T08-17-47-309Z.md

Tested: omx ask claude security review artifact .omx/artifacts/claude-security-review-cross-check-for-pr-374-current-working-tree--2026-04-27T08-26-54-079Z.md

Not-tested: GitHub-hosted CodeRabbit refresh until pushed.
@mintlify
Copy link
Copy Markdown

mintlify Bot commented Apr 27, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
ccb-863780bf 🟢 Ready View Preview Apr 27, 2026, 8:36 AM

💡 Tip: Enable Workflows to automatically generate PRs for you.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b714ecff-7c35-40f4-a201-54dda97e1127

📥 Commits

Reviewing files that changed from the base of the PR and between a90c164 and d3b16ae.

📒 Files selected for processing (2)
  • src/utils/__tests__/udsMessaging.test.ts
  • src/utils/udsClient.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/utils/tests/udsMessaging.test.ts

📝 Walkthrough

Walkthrough

Refactors UDS client to accept an onSocketError callback and configurable timeout; centralizes pre-connect failure handling. Updates tests: AgentSummary log checks use substring matching, teammateMailbox errno expectation is platform-aware, and udsMessaging tests expand timeout and add connectToPeer lifecycle/error expectations.

Changes

Cohort / File(s) Summary
AgentSummary tests
src/services/AgentSummary/__tests__/agentSummary.test.ts
Replaces strict expect(debugLogs).toContain(...) assertions with a helper that checks debugLogs.some(msg => msg.includes(fragment)) for tolerant substring matching.
Mailbox errno test
src/utils/__tests__/teammateMailbox.test.ts
Makes expected errno from writeToMailbox platform-dependent: non-Windows expects EISDIR; Windows still accepts EISDIR, EPERM, EACCES.
UDS messaging tests
src/utils/__tests__/udsMessaging.test.ts
Increases send timeout (50→200 ms); adds tests asserting sendToUdsSocket timeout produces UdsPeerConnectionError; adds connectToPeer tests for rejection preserving socketPath, and for validated socket ownership, listener wiring, and explicit teardown behavior after connect.
UDS client implementation
src/utils/udsClient.ts
Changes signature to connectToPeer(socketPath: string, onSocketError: (error: Error) => void, timeoutMs = 5000): Promise<Socket>; implements settled guard, centralized fail(cause) that removes listeners, clears timeout, destroys socket and rejects with UdsPeerConnectionError(socketPath, cause); routes pre-connect errors/timeouts to fail; on successful connect attaches onSocketError before resolving.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant UdsClient
  participant Socket
  participant Timeout

  Caller->>UdsClient: connectToPeer(socketPath, onSocketError, timeoutMs)
  UdsClient->>Socket: create socket, socket.connect(socketPath)
  UdsClient->>Timeout: start timeout timer
  par pre-connect error
    Socket-->>UdsClient: 'error' (pre-connect)
    UdsClient->>Socket: remove listeners, destroy()
    UdsClient-->>Caller: reject UdsPeerConnectionError(socketPath, cause)
  and timeout expires
    Timeout-->>UdsClient: timeout expiry
    UdsClient->>Socket: remove listeners, destroy()
    UdsClient-->>Caller: reject UdsPeerConnectionError(socketPath, timeout)
  end
  opt connect success
    Socket-->>UdsClient: 'connect'
    UdsClient->>Timeout: clear()
    UdsClient->>Socket: remove internal pre-connect listeners
    UdsClient->>Socket: attach onSocketError handler
    UdsClient-->>Caller: resolve Socket
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐇 I checked the socket, set a gentle beat,
Logs now whisper fragments, tidy and neat,
Mailboxes bow to platforms, polite and true,
Connections hand control back to you.
A little hop for safer, clearer glue.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: keep UDS peer failures structured' accurately summarizes the main change: preserving UDS send/connect timeout and socket-error paths under a single structured error type (UdsPeerConnectionError), which is the core objective of the PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/coderabbit-final-review-fixes

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 27, 2026

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/utils/udsClient.ts (1)

271-297: Clean handoff of socket lifecycle to caller.

The shared fail path with the settled guard covers both the 'error' and timeout races correctly: once 'connect' resolves, setTimeout(0) disables the inactivity timer (so the timeout callback can no longer fire) and off('error', fail) detaches the internal listener so caller-owned errors aren't swallowed. The reciprocal path (timeout/error → fail destroys + rejects) is also single-shot via settled.

One consequence worth highlighting (consistent with the PR's stated design decision): between resolve(conn) and the caller attaching its own 'error' handler, the socket has no error listener, and an unhandled 'error' event would surface as an uncaught exception. Callers must attach an error handler synchronously upon receiving the socket. Both current call sites in the new tests satisfy this implicitly via the test scaffolding, but a brief note in the JSDoc would make the contract explicit for future callers.

📝 Suggested doc tightening
 /**
  * Connect to a peer and return the raw socket for bidirectional communication.
- * The caller is responsible for managing the connection lifecycle.
+ * The caller is responsible for managing the connection lifecycle, including
+ * attaching an `'error'` listener immediately upon receiving the socket —
+ * this function detaches its internal error listener on successful connect
+ * so caller-owned errors are not silently swallowed.
  */
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/udsClient.ts` around lines 271 - 297, The connectToPeer function
hands a raw Socket (created via createConnection) to the caller once 'connect'
fires and removes the internal 'error' listener, which can leave the socket
without any error handler until the caller attaches one; update the JSDoc for
connectToPeer to explicitly document this lifecycle contract: state that callers
must synchronously attach an 'error' listener immediately after the Promise
resolves (or use try/catch around await and attach before any I/O) because
unhandled 'error' events on the socket will throw, and reference connectToPeer,
createConnection, and UdsPeerConnectionError to make the expectation clear.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/utils/__tests__/udsMessaging.test.ts`:
- Around line 322-358: The test 'connectToPeer leaves connected socket lifecycle
to the caller' uses connectToPeer(path, 50) with a 50ms timeout which is too
tight and causes flakiness on slow CI; update the call to use a larger timeout
(e.g., 1000ms) so the test still checks post-connect lifecycle while avoiding
transient UdsPeerConnectionError failures from slow connects, i.e., change the
literal 50 in the connectToPeer invocation in this test to 1000 (referencing the
connectToPeer import from ../udsClient.js and the test block named
'connectToPeer leaves connected socket lifecycle to the caller').

---

Nitpick comments:
In `@src/utils/udsClient.ts`:
- Around line 271-297: The connectToPeer function hands a raw Socket (created
via createConnection) to the caller once 'connect' fires and removes the
internal 'error' listener, which can leave the socket without any error handler
until the caller attaches one; update the JSDoc for connectToPeer to explicitly
document this lifecycle contract: state that callers must synchronously attach
an 'error' listener immediately after the Promise resolves (or use try/catch
around await and attach before any I/O) because unhandled 'error' events on the
socket will throw, and reference connectToPeer, createConnection, and
UdsPeerConnectionError to make the expectation clear.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: eaa62c50-57f8-41c1-9670-5ae7d5df699c

📥 Commits

Reviewing files that changed from the base of the PR and between b47731a and 85dc1b9.

📒 Files selected for processing (4)
  • src/services/AgentSummary/__tests__/agentSummary.test.ts
  • src/utils/__tests__/teammateMailbox.test.ts
  • src/utils/__tests__/udsMessaging.test.ts
  • src/utils/udsClient.ts

Comment thread src/utils/__tests__/udsMessaging.test.ts
CodeRabbit's #375 pass found that connectToPeer now correctly hands socket errors to the caller, but the JSDoc needed to spell out that contract. The lifecycle test also uses a less brittle post-connect timeout so slow CI does not turn the ownership check into a connection-speed race.

Constraint: The raw socket API intentionally detaches its internal listener after successful connect so caller-owned errors are not swallowed.

Rejected: Keep the test timeout at 50ms | it tests scheduler speed instead of socket lifecycle ownership.

Confidence: high

Scope-risk: narrow

Directive: connectToPeer callers must attach their own error listener immediately after awaiting the socket.

Tested: bun test src/utils/__tests__/udsMessaging.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: git diff --check

Tested: bun run test:all

Not-tested: GitHub-hosted CodeRabbit refresh until pushed.
CodeRabbit and Claude review found that documenting caller-owned raw socket errors still left a Promise handoff window and a stale timeout-listener risk. The peer connection API now requires a caller error handler and installs it before resolving, while cleanup removes internal error and timeout listeners on every path.

Constraint: Keep the fix precise to PR #375 review feedback and avoid warning suppression or fallback behavior.
Rejected: Leave the behavior documented only | still permits an unhandled socket error window between resolve and caller listener attachment.
Rejected: Keep a no-op internal error listener | would silently swallow caller-owned socket errors.
Confidence: high
Scope-risk: narrow
Directive: Do not add raw connectToPeer callers without providing a real onSocketError handler and capability handshake.
Tested: bun test src/utils/__tests__/udsMessaging.test.ts src/services/AgentSummary/__tests__/agentSummary.test.ts
Tested: bunx tsc --noEmit --pretty false
Tested: bun run lint
Tested: bun run test:all
Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage
Tested: bun run build
Tested: bun run build:vite
Tested: bun audit
Not-tested: Manual external ACP peer runtime beyond repository tests.
The raw socket handoff no longer needs Socket#setTimeout; an ordinary connection deadline keeps the timeout behavior while avoiding an internal socket timeout listener that has no reliable UDS integration path to exercise.

Constraint: Keep Codecov coverage honest without adding ignore pragmas, mocks, or fallback suppression.

Rejected: c8 ignore on the timeout listener | hides the uncovered branch instead of simplifying the lifecycle.

Rejected: keep Socket#setTimeout listener | leaves a socket listener lifecycle to manage for a connect-only deadline.

Confidence: high

Scope-risk: narrow

Directive: Keep connectToPeer errors caller-owned via onSocketError and reject pre-connect failures with UdsPeerConnectionError.

Tested: bun test src/utils/__tests__/udsMessaging.test.ts src/services/AgentSummary/__tests__/agentSummary.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun test src/utils/__tests__/udsMessaging.test.ts --coverage --coverage-reporter lcov --coverage-dir coverage-uds

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Tested: bun audit

Not-tested: Manual external ACP peer runtime beyond repository tests.
@claude-code-best claude-code-best merged commit 4266149 into main Apr 27, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants